DIP-Python tutorials for image processing and machine learning(68)-SVM
Notes from the YouTube channel DigitalSreeni.
68 - Quick introduction to Support Vector Machines -SVM
sklearn.svm.SVC — scikit-learn 1.2.0 documentation
C-Support Vector Classification.
- The implementation is based on libsvm. The fit time scales at least quadratically with the number of samples and may be impractical beyond tens of thousands of samples. For large datasets consider using `LinearSVC` or `SGDClassifier` instead, possibly after a `Nystroem` transformer.
- The multiclass support is handled according to a one-vs-one scheme.
- For details on the precise mathematical formulation of the provided kernel functions and how `gamma`, `coef0` and `degree` affect each other, see the corresponding section in the narrative documentation: Kernel functions.
Read more in the User Guide.
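The note above about large datasets can be sketched roughly as follows. This is a minimal example, not the tutorial's code: the synthetic dataset, `n_components=200`, and the choice of `SGDClassifier` with hinge loss are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a large dataset (assumption, not from the tutorial).
X, y = make_classification(
    n_samples=5000, n_features=20, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Nystroem approximates the RBF kernel feature map with 200 components,
# then a linear model with hinge loss (a linear SVM) is trained by SGD.
approx_svm = make_pipeline(
    StandardScaler(),
    Nystroem(kernel="rbf", n_components=200, random_state=0),
    SGDClassifier(loss="hinge", max_iter=1000, tol=1e-3, random_state=0),
)
approx_svm.fit(X_train, y_train)

nystroem_accuracy = approx_svm.score(X_test, y_test)
print("Accuracy =", nystroem_accuracy)
```

The training cost of this pipeline grows roughly linearly with the number of samples, which is why it is suggested as a substitute for `SVC` when the dataset is large.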
**Parameters:**
C: float, default=1.0
- Regularization parameter. The strength of the regularization is inversely proportional to C. Must be strictly positive. The penalty is a squared l2 penalty.
kernel: {‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’} or callable, default=’rbf’
- Specifies the kernel type to be used in the algorithm. If none is given, ‘rbf’ will be used. If a callable is given it is used to pre-compute the kernel matrix from data matrices; that matrix should be an array of shape `(n_samples, n_samples)`.
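As a small illustration of the callable option, the sketch below passes a hand-written linear kernel to `SVC` and compares it with the built-in `kernel="linear"`. The function name `linear_kernel` and the iris dataset are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC


def linear_kernel(A, B):
    """Hypothetical custom kernel: plain dot products between rows of A and B."""
    return A @ B.T


X, y = load_iris(return_X_y=True)

# During fit, SVC calls linear_kernel(X, X) to build the kernel matrix.
svc_callable = SVC(kernel=linear_kernel).fit(X, y)
svc_builtin = SVC(kernel="linear").fit(X, y)

# Fraction of training samples on which both models predict the same label.
agreement = np.mean(svc_callable.predict(X) == svc_builtin.predict(X))
print("Agreement with built-in linear kernel:", agreement)
```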
degree: int, default=3
- Degree of the polynomial kernel function (‘poly’). Must be non-negative. Ignored by all other kernels.
gamma: {‘scale’, ‘auto’} or float, default=’scale’
- Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’.
- If `gamma='scale'` (default) is passed, it uses 1 / (n_features * X.var()) as the value of gamma.
- If ‘auto’, it uses 1 / n_features.
- If float, it must be non-negative.
Changed in version 0.22: The default value of `gamma` changed from ‘auto’ to ‘scale’.
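To make the two `gamma` formulas concrete, here is a minimal sketch that computes both values by hand on the iris dataset (an assumption for illustration) and checks that passing the hand-computed 'scale' value gives the same decision function as `gamma='scale'`.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
n_features = X.shape[1]

# The two formulas quoted in the parameter description.
gamma_scale = float(1.0 / (n_features * X.var()))
gamma_auto = 1.0 / n_features
print("scale:", gamma_scale, "auto:", gamma_auto)

# A model fitted with gamma='scale' should behave like one fitted with
# the hand-computed value passed as a float.
svc_scale = SVC(kernel="rbf", gamma="scale").fit(X, y)
svc_manual = SVC(kernel="rbf", gamma=gamma_scale).fit(X, y)
same_decisions = np.allclose(
    svc_scale.decision_function(X), svc_manual.decision_function(X)
)
print("Same decision function:", same_decisions)
```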
coef0: float, default=0.0
- Independent term in kernel function. It is only significant in ‘poly’ and ‘sigmoid’.
shrinking: bool, default=True
- Whether to use the shrinking heuristic. See the User Guide.
probability: bool, default=False
- Whether to enable probability estimates. This must be enabled prior to calling `fit`, will slow down that method as it internally uses 5-fold cross-validation, and `predict_proba` may be inconsistent with `predict`. Read more in the User Guide.
tol: float, default=1e-3
- Tolerance for stopping criterion.
cache_size: float, default=200
- Specify the size of the kernel cache (in MB).
class_weight: dict or ‘balanced’, default=None
- Set the parameter C of class i to class_weight[i]*C for SVC. If not given, all classes are supposed to have weight one. The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as `n_samples / (n_classes * np.bincount(y))`.
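The “balanced” formula can be checked by hand. The sketch below uses a made-up imbalanced label vector (90 samples of class 0, 10 of class 1) and compares the manual result with `compute_class_weight` and with the fitted `class_weight_` attribute.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.utils.class_weight import compute_class_weight

# Made-up imbalanced data: 90 samples of class 0 and 10 of class 1.
rng = np.random.default_rng(0)
y = np.array([0] * 90 + [1] * 10)
X = rng.normal(size=(100, 2)) + y[:, None]  # shift class 1 away from class 0

# n_samples / (n_classes * np.bincount(y)) = 100 / (2 * [90, 10])
manual_weights = len(y) / (2 * np.bincount(y))
sklearn_weights = compute_class_weight("balanced", classes=np.unique(y), y=y)

clf = SVC(class_weight="balanced").fit(X, y)
print("manual:       ", manual_weights)
print("sklearn:      ", sklearn_weights)
print("class_weight_:", clf.class_weight_)
```

The rare class gets a weight of 5.0 and the common class about 0.56, so mistakes on the rare class are penalized roughly nine times more heavily.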
verbose: bool, default=False
- Enable verbose output. Note that this setting takes advantage of a per-process runtime setting in libsvm that, if enabled, may not work properly in a multithreaded context.
max_iter: int, default=-1
- Hard limit on iterations within solver, or -1 for no limit.
decision_function_shape: {‘ovo’, ‘ovr’}, default=’ovr’
Whether to return a one-vs-rest (‘ovr’) decision function of shape (n_samples, n_classes) as all other classifiers, or the original one-vs-one (‘ovo’) decision function of libsvm which has shape (n_samples, n_classes * (n_classes - 1) / 2). However, note that internally, one-vs-one (‘ovo’) is always used as a multi-class strategy to train models; an ovr matrix is only constructed from the ovo matrix. The parameter is ignored for binary classification.
Changed in version 0.19: decision_function_shape is ‘ovr’ by default.
New in version 0.17: decision_function_shape=’ovr’ is recommended.
Changed in version 0.17: Deprecated decision_function_shape=’ovo’ and None.
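The difference between the two shapes is easier to see with more than three classes. This sketch assumes the 10-class digits dataset, for which one-vs-one gives 10 * 9 / 2 = 45 columns.

```python
from sklearn.datasets import load_digits
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)  # 10 classes

ovr = SVC(decision_function_shape="ovr").fit(X, y)
ovo = SVC(decision_function_shape="ovo").fit(X, y)

# Decision values for the first five samples only.
ovr_shape = ovr.decision_function(X[:5]).shape  # (n_samples, n_classes)
ovo_shape = ovo.decision_function(X[:5]).shape  # (n_samples, n_classes * (n_classes - 1) / 2)
print("ovr:", ovr_shape)
print("ovo:", ovo_shape)
```

Both models are trained the same way internally; only the shape of the returned decision values differs.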
break_ties: bool, default=False
If true, `decision_function_shape='ovr'`, and number of classes > 2, predict will break ties according to the confidence values of decision_function; otherwise the first class among the tied classes is returned. Please note that breaking ties comes at a relatively high computational cost compared to a simple predict.
New in version 0.22.
random_state: int, RandomState instance or None, default=None
- Controls the pseudo random number generation for shuffling the data for probability estimates. Ignored when `probability` is False. Pass an int for reproducible output across multiple function calls. See Glossary.
**Attributes:**
class_weight_: ndarray of shape (n_classes,)
- Multipliers of parameter C for each class. Computed based on the class_weight parameter.
classes_: ndarray of shape (n_classes,)
The classes labels.
coef_: ndarray of shape (n_classes * (n_classes - 1) / 2, n_features)
- Weights assigned to the features when kernel=“linear”.
dual_coef_: ndarray of shape (n_classes -1, n_SV)
- Dual coefficients of the support vector in the decision function (see Mathematical formulation), multiplied by their targets. For multiclass, coefficient for all 1-vs-1 classifiers. The layout of the coefficients in the multiclass case is somewhat non-trivial. See the multi-class section of the User Guide for details.
fit_status_: int
- 0 if correctly fitted, 1 otherwise (will raise warning)
intercept_: ndarray of shape (n_classes * (n_classes - 1) / 2,)
- Constants in decision function.
n_features_in_: int
- Number of features seen during fit.
New in version 0.24.
feature_names_in_: ndarray of shape (n_features_in_,)
- Names of features seen during fit. Defined only when X has feature names that are all strings.
New in version 1.0.
n_iter_: ndarray of shape (n_classes * (n_classes - 1) // 2,)
- Number of iterations run by the optimization routine to fit the model. The shape of this attribute depends on the number of models optimized which in turn depends on the number of classes.
New in version 1.1.
support_: ndarray of shape (n_SV)
- Indices of support vectors.
support_vectors_: ndarray of shape (n_SV, n_features)
Support vectors.
n_support_: ndarray of shape (n_classes,), dtype=int32
Number of support vectors for each class.
probA_: ndarray of shape (n_classes * (n_classes - 1) / 2)
Parameter learned in Platt scaling when probability=True.
probB_: ndarray of shape (n_classes * (n_classes - 1) / 2)
Parameter learned in Platt scaling when probability=True.
shape_fit_: tuple of int of shape (n_dimensions_of_X,)
Array dimensions of training vector X.
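The attribute shapes listed above can be inspected on a small fitted model. This is a sketch assuming the 3-class, 4-feature iris dataset and a linear kernel (so that `coef_` is available); with 3 classes there are 3 * 2 / 2 = 3 one-vs-one classifiers.

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # 150 samples, 4 features, 3 classes
clf = SVC(kernel="linear").fit(X, y)

n_sv = clf.support_vectors_.shape[0]  # total number of support vectors

print("classes_        ", clf.classes_)
print("coef_           ", clf.coef_.shape)             # (3, 4)
print("dual_coef_      ", clf.dual_coef_.shape)        # (n_classes - 1, n_SV)
print("intercept_      ", clf.intercept_.shape)        # (3,)
print("support_        ", clf.support_.shape)          # (n_SV,)
print("support_vectors_", clf.support_vectors_.shape)  # (n_SV, 4)
print("n_support_      ", clf.n_support_)              # support vectors per class
print("fit_status_     ", clf.fit_status_)
print("shape_fit_      ", clf.shape_fit_)
```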
68b - SVM vs. Random Forest for image segmentation
C:\Users\gzjzx\anaconda3\lib\site-packages\sklearn\svm\_base.py:1225: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
warnings.warn(
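The warning above comes from `LinearSVC` (the liblinear solver). The original code for this step is not preserved in these notes, so the following is only a sketch of two common remedies, standardizing the features and raising `max_iter`, on synthetic data. The dataset and the value `max_iter=10000` are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic data with deliberately large feature values, which tends to
# slow down liblinear's convergence when left unscaled.
X, y = make_classification(
    n_samples=2000,
    n_features=20,
    n_informative=5,
    n_clusters_per_class=1,
    random_state=0,
)
X = X * 1000.0

# Scale the features first, and allow more iterations than the default.
clf = make_pipeline(
    StandardScaler(),
    LinearSVC(max_iter=10000, random_state=0),
)
clf.fit(X, y)

train_accuracy = clf.score(X, y)
print("Training accuracy =", train_accuracy)
```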
SVM is much slower than Random Forest.
Accuracy= 0.9525666606203519
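This does not mean that support vector machines are useless. The point is that for pixel-level segmentation an SVM may not be the right choice, whereas for image classification SVMs actually do very well.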
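The comparison can be reproduced in spirit with a small timing sketch. This is not the tutorial's code: the synthetic feature matrix stands in for per-pixel features, the model settings are assumptions, and the timings on such a small dataset will not necessarily show the slowdown observed on real images.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Each row plays the role of one pixel's feature vector (assumption).
X, y = make_classification(
    n_samples=5000,
    n_features=10,
    n_informative=5,
    n_clusters_per_class=1,
    random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "LinearSVC": LinearSVC(max_iter=5000, random_state=42),
    "RandomForest": RandomForestClassifier(n_estimators=20, random_state=42),
}

results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    accuracy = model.score(X_test, y_test)
    results[name] = (accuracy, elapsed)
    print(f"{name}: accuracy={accuracy:.4f}, fit time={elapsed:.3f}s")
```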